Population Genetics

general approach

  1. evolution = changes in gene frequency
  2. gene frequency
    1. see Table 26-1, p. 781
    2. make sure you understand the calculation of allele frequencies from genotype frequencies
  3. genotype frequency
    1. see Table 26-1, p. 781
  4. population
    1. all individuals of a given species in a local area that can mate with one another
    2. represented as a point in gene frequency space
    3. often thought of as a gene pool
  5. population variation
    1. genetic distance
      1. example from AIDS virus HIV
    2. heterozygosity
      1. measure of genetic variation
      2. two ways to calculate heterozygosity from gametic types
        1. Table 26-2, p. 782
        2. MISTAKE IN TABLE 26-2: for English population the frequency of the Ns type should be 0.39 NOT 0.29
        3. use allele freqs. to calc. avg. freq. heterozygotes at the two loci
          1. calc. gene frequency
          2. calc. expected heterozygotes
          3. average the freq. of expected heterozygotes
        4. from gamete freqs. directly
          1. use gamete freq. to calc. zygote freqs.
          2. include zygotes that are heterozygous for at least one of the loci
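
As a rough sketch of the two heterozygosity calculations above, the Python below works from gamete frequencies at two loci with two alleles each (M/N and S/s). The gamete frequencies and function names are made-up assumptions for illustration; they are not the values in Table 26-2.

```python
# Hypothetical gamete frequencies for two loci (M/N and S/s); NOT from Table 26-2.
gametes = {"MS": 0.25, "Ms": 0.30, "NS": 0.10, "Ns": 0.35}

def avg_expected_heterozygosity(g):
    """Method 1: allele freqs -> expected heterozygotes (2pq) per locus -> average."""
    p_M = g["MS"] + g["Ms"]            # frequency of M at the first locus
    p_S = g["MS"] + g["NS"]            # frequency of S at the second locus
    het_locus1 = 2 * p_M * (1 - p_M)
    het_locus2 = 2 * p_S * (1 - p_S)
    return (het_locus1 + het_locus2) / 2

def gametic_heterozygosity(g):
    """Method 2: random union of gametes; zygotes heterozygous at >= 1 locus."""
    # A zygote is homozygous at both loci only when both gametes are the same type.
    return 1 - sum(f * f for f in g.values())

print(avg_expected_heterozygosity(gametes), gametic_heterozygosity(gametes))
```

The second measure is never smaller than the first, because a zygote counts as heterozygous as soon as either locus differs.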

Hardy-Weinberg law

  1. basic problem
    1. gene freqs --> genotype freqs
    2. genotype freqs --> gene freqs (is easy)
  2. background
    1. don't worry about this stuff
    2. Castle in 1903
    3. denied by Yule 1908
      1. consider a dominant and recessive allele A and a
      2. consider a cross between pure types: aa X AA
      3. F1 cross:
        1. cross AA x aa --> Aa
        2. phenotype ratio 1:1 --> 1:0
      4. F2 cross:
        1. cross Aa x Aa --> 0.25 AA, 0.50 Aa, 0.25 aa
        2. phenotype ratio 1:0 --> 3:1
      5. Yule's conclusions
        1. the 3:1 ratio was the only stable ratio for a dominant gene and would be reached from any initial frequency
        2. reasonable to suppose that the dominant gene will take over and that ultimately a population's phenotypes will be due to dominant genes with recessive genes existing as rare mutants
        3. this implies that phenotypes would exist not only because of their effects on fitness but also because of Mendelian ratios (depending on dominance, etc)
        4. this has far reaching implications for evolution
        5. instead, Mendelian inheritance by itself is neutral: it does not change gene frequencies
  3. assumptions
    1. random mating
      1. random union of gametes
      2. use Punnett square
        1. see calculations in text
      3. multiply gene freqs to get expected genotype freqs
    2. no selection
    3. no migration
    4. no mutation
    5. infinite population size
  4. MN blood group
    1. background
      1. one of many red blood cell antigen-antibody systems
      2. two antigens: M and N
      3. single locus with two codominant alleles
      4. genotypes: MM, MN, NN
      5. phenotypes: M, MN, N
    2. observed and expected frequencies
      1. see Table 26-1
  5. general calculation
    1. genotype freqs in present generation
      1. AA: fAA
      2. Aa: fAa
      3. aa: faa
      4. may not be HW freqs
    2. gene freqs in present generation from genotype frequencies
      1. p = f_AA + 0.5 f_Aa
      2. q = 1 - p = f_aa + 0.5 f_Aa
    3. genotype freqs in next generation
      1. AA: p^2
      2. Aa: 2pq
      3. aa: q^2
      4. these HW freqs are stable from then on
    4. gene freqs next generation
      1. p_next = p^2 + 0.5 * 2pq = p(p + q) = p
      2. q_next = q^2 + 0.5 * 2pq = q(p + q) = q
      3. gene freqs don't change
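
The general calculation above can be sketched in a few lines of Python. The starting genotype frequencies are hypothetical and deliberately not in HW proportions; the function names are my own.

```python
def allele_freqs(f_AA, f_Aa, f_aa):
    """Gene frequencies from genotype frequencies: p = fAA + 0.5 fAa."""
    total = f_AA + f_Aa + f_aa
    p = (f_AA + 0.5 * f_Aa) / total
    return p, 1 - p

def hardy_weinberg(p):
    """Expected genotype frequencies (AA, Aa, aa) under random mating."""
    q = 1 - p
    return p * p, 2 * p * q, q * q

# Hypothetical present-generation genotype frequencies (not HW freqs).
start = (0.5, 0.2, 0.3)
p, q = allele_freqs(*start)
gen1 = hardy_weinberg(p)        # genotype freqs after one round of random mating
p1, q1 = allele_freqs(*gen1)    # gene freqs in that generation
gen2 = hardy_weinberg(p1)       # one more generation: same as gen1

print(start, gen1, gen2, p, p1)
```

Under these assumptions the gene frequency is the same before and after, and the genotype frequencies stop changing after the first generation of random mating.
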
  6. overview of Hardy Weinberg Law
    1. HW law
      1. conclusions
        1. gene frequency equilibrium every generation
        2. genotype frequency equilibrium attained in a single generation
      2. forces
        1. mutation
        2. selection
        3. migration
        4. drift
    2. analogous to Newton's first law
      1. if no forces operate on an object, the velocity and direction of motion is constant
      2. forces
        1. weak
        2. strong
        3. electromagnetic
        4. etc.
    3. analogous to Malthus' law
      1. if no forces operate on a population its per capita growth rate is constant, r
      2. forces
        1. competition
        2. predation
        3. etc.
    4. provides foundation for model building and explanations
    5. provides a null hypothesis
  7. consequences of HW law for rare alleles
    1. freq hets >> freq homo for rare alleles: 2pq >> q^2
    2. deleterious recessive genes
    3. cystic fibrosis
      1. gene
        1. autosomal recessive gene
        2. DNA sequence has open reading frame
      2. protein
        1. aa
        2. defective CFTR chloride channel (cystic fibrosis transmembrane conductance regulator)
      3. phenotype
        1. abnormal glandular secretions, death before 20
        2. mucus builds up --> suffocation
      4. q = freq of allele = 0.024
      5. 2pq = 0.047 = 1/21
      6. q^2 = 1/1700
    4. each of us has about 10 such alleles masked
    5. outcrossing functions to mask these alleles
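
The cystic fibrosis numbers above follow directly from the HW frequencies. A minimal check in Python, using only the q = 0.024 given in the notes (variable names are my own):

```python
q = 0.024                 # frequency of the recessive allele, from the notes
p = 1 - q

carriers = 2 * p * q      # heterozygotes (unaffected carriers)
affected = q * q          # recessive homozygotes

# Ratio of carriers to affected individuals; equals 2p/q, so it grows as q shrinks.
ratio = carriers / affected

print(carriers, 1 / carriers)   # about 1 in 21
print(affected, 1 / affected)   # about 1 in 1700
print(ratio)
```

Because the ratio is 2p/q, the rarer the allele, the larger the share of its copies hidden in heterozygotes.
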
  8. testing for fit to HW
    1. observed versus expected frequencies
    2. describe the chi-square test
      1. numbers of observed and expected types in each class
      2. degrees of freedom
        1. number of classes - number of items calculated from data
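
A sketch of the chi-square test for fit to HW, assuming a single locus with two codominant alleles so that all three genotypes can be counted. The counts are invented for illustration, and the function name is my own. Two items are calculated from the data (the total and p), so with three classes there is 1 degree of freedom.

```python
def hw_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic comparing observed genotype counts with HW expectations."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)     # allele frequency estimated from the data
    q = 1 - p
    observed = [n_AA, n_Aa, n_aa]
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 2              # classes minus items calculated from data (n and p)
    return chi2, df, expected

# Hypothetical counts of MM, MN, NN individuals.
chi2, df, expected = hw_chi_square(298, 489, 213)
print(chi2, df, expected)
```

The statistic is then compared with the tabulated critical value for that many degrees of freedom (3.841 at the 5% level for 1 degree of freedom); a smaller value means no significant departure from HW.
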
  9. two loci
    1. don't worry about this stuff
    2. consider two loci: A, a and B, b
      1. r = recombination rate
      2. p and q freqs of A and a
      3. r and s freqs of B and b (not the recombination rate r above)
    3. gamete types: AB, Ab, aB, ab
    4. gamete freqs: x1, x2, x3, x4
    5. gene freqs
      1. p = x1 + x2, q = x3 + x4
      2. r = x1 + x3, s = x2 + x4
    6. linkage disequilibrium: D = x1 x4 - x2 x3
    7. how to get from gene freqs to gamete freqs?
      1. x1 = pr + D
      2. x2 = ps - D
      3. x3 = qr - D
      4. x4 = qs + D
    8. D_{t+1} = (1 - r) D_t, where r is the recombination rate
    9. conclusions
      1. gene freqs constant
      2. gamete & zygotic freqs approach HW with time
      3. D goes to zero with time at rate r
    10. linkage disequilibrium affects selection
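
The two-locus relations above can be checked numerically. In this sketch the gamete frequencies are hypothetical, and because the notes use r for two different things, the code calls the frequency of B `freq_B` and the recombination rate `recomb`; both names are my own.

```python
def linkage_disequilibrium(x1, x2, x3, x4):
    """D = x1*x4 - x2*x3 for gamete freqs of AB, Ab, aB, ab."""
    return x1 * x4 - x2 * x3

def gamete_freqs(p, freq_B, D):
    """Gamete freqs (AB, Ab, aB, ab) from gene freqs and D."""
    q, s = 1 - p, 1 - freq_B
    return (p * freq_B + D, p * s - D, q * freq_B - D, q * s + D)

def d_after(D0, recomb, generations):
    """D after some generations of random mating: D shrinks by (1 - recomb) each time."""
    return D0 * (1 - recomb) ** generations

x = (0.4, 0.1, 0.2, 0.3)                # hypothetical gamete frequencies
D = linkage_disequilibrium(*x)
p = x[0] + x[1]                          # freq of A
freq_B = x[0] + x[2]                     # freq of B
rebuilt = gamete_freqs(p, freq_B, D)     # should reproduce x

print(D, rebuilt, d_after(D, 0.5, 1), d_after(D, 0.01, 100))
```

With unlinked loci (recombination rate 0.5) D halves every generation; with tight linkage it persists much longer.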

evolutionary forces

  1. mutation
  2. migration
  3. selection
    1. fitness
    2. selection coefficient
  4. chance; genetic drift

mutation

  1. random with regard to effect on fitness
  2. A --> a at rate m, p freq A, q freq a
  3. p_t = p_{t-1} (1 - m)
  4. for a single locus: m = 10^-5
  5. weak force in changing gene freqs
    1. see Figure 26-9 in text
    2. although weak force in changing gene frequencies mutation is still important in evolution
    3. recall Figure from lecture on mutation and selection
    4. all variation ultimately stems from mutation
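
To see how weak mutation is as a force, the recursion above can be iterated in closed form. This is a small sketch using the m = 10^-5 from the notes; the function names and the choice of starting at p = 1 are my own assumptions.

```python
import math

def freq_after_mutation(p0, m, t):
    """p_t = p_0 (1 - m)^t, from applying p_t = p_{t-1}(1 - m) for t generations."""
    return p0 * (1 - m) ** t

def generations_to_reach(p0, p_target, m):
    """Number of generations needed for mutation alone to take p from p0 to p_target."""
    return math.log(p_target / p0) / math.log(1 - m)

m = 1e-5
print(freq_after_mutation(1.0, m, 1000))     # barely changed after 1000 generations
print(generations_to_reach(1.0, 0.5, m))     # tens of thousands of generations to halve p
```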

natural selection

  1. propositions about nature (Darwin)
  2. variability
  3. heritability
  4. struggle to survive
  5. conclusion: natural selection must occur

sickle cell anemia

  1. background
    1. we will use this as an example of natural selection in human populations
    2. sickle cell anemia
      1. common in West and Central Africa
      2. gene freq reaches 16% in Africa
      3. up to 5% of the population dies in infancy
      4. gene freq is about 5% for black Americans
      5. geographic distribution correlated with malaria
      6. why is the gene so widespread?
    3. hemoglobin molecule
      1. transport O2 in blood
      2. α locus, β locus
      3. tetramer = 2α + 2β
  2. variation; alleles at β locus
    1. A = wild type, in freq p
    2. S = sickle cell allele, in freq q
    3. C = recessive allele, in freq r
  3. "struggle to survive"
    1. data from West-African population
    2. relative and absolute fitness
      1. absolute fitness is expected numbers of offspring
      2. to obtain relative fitness divide absolute fitnesses by that of standard
      3. consider an example, letting Aa be the standard

        genotype   abs fitness   rel fitness
        AA         9             0.9
        Aa         10            1
        aa         2             0.2

      4. we will use relative fitness by letting genotype AS be the standard

      genotype   fitness   condition
      AA         0.9       malarial susceptible
      AS         1.0       malarial resistance
      SS         0.2       anemia
      AC         0.9       malarial susceptible
      SC         0.7       anemia
      CC         1.3       malarial resistance

    3. questions
      1. S in high freqs of 0.10-0.16, why?
      2. why doesn't C allele increase?
    4. gene freq equations
      1. Δq = (pq w_AS + q^2 w_SS + qr w_SC)/w_avg - q
      2. w_avg = average fitness
        1. = W with over bar on p. 758
      3. equations on p. 758 are for two alleles
      4. Δq = q a_S/w_avg
        1. a_S = p w_AS + q w_SS + r w_SC - w_avg
        2. a_S = avg effect of the S allele on fitness
          1. depends on fitness
          2. depends on gene and genotype frequencies
        3. sign of Δq depends on sign of a_S
      5. Δr = r a_C/w_avg, similarly
        1. a_C = p w_AC + q w_SC + r w_CC - w_avg
    5. application of gene freq equation using a_S
      1. pre-malaria African population
        1. malaria initially rare
          1. malaria was not an important selective agent initially
          2. slash-and-burn agriculture facilitates malaria
          3. African slash-and-burn agriculture began about 200 BC
          4. increased the number of breeding places for mosquitoes
        2. fixed for A (p=1, q=0, r=0)
      2. consider rare mutations from A to S or C
        1. only heterozygote effect is important
      3. after malaria
        1. w_avg = 0.9
        2. a_S = (1)(1.0) - 0.9 = 0.1 > 0
          1. conclusion: S allele increases rapidly
        3. a_C = (1)(0.9) - 0.9 = 0
          1. conclusion: no selection for C
          2. homozygous effect not important for rare alleles in outcrossing populations
          3. however, if inbreeding occurred a_C > 0
      4. at equilibrium
        1. p_eq = 0.89, q_eq = 0.11, r_eq = 0
        2. a_S = 0
        3. a_C = -0.03
        4. w_avg = 0.91 << w_optimal = w_CC = 1.3
    6. conclusion
      1. evolution not optimal
      2. evolution does not produce what is best for the species
      3. natural selection has many components
    7. natural selection has many components
      1. fitness
        1. intrinsic aspects
        2. environment
      2. mating system
      3. genetic system
      4. population structure
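
The gene frequency equations in the sickle cell section can be evaluated directly from the fitness table. The sketch below assumes random mating and uses the relative fitnesses from the notes; the function names are my own, and the equilibrium frequencies 8/9 and 1/9 are my unrounded reading of the 0.89 and 0.11 given above.

```python
# Relative fitnesses from the notes (AS is the standard); keys are sorted allele pairs.
W = {
    ("A", "A"): 0.9, ("A", "S"): 1.0, ("S", "S"): 0.2,
    ("A", "C"): 0.9, ("C", "S"): 0.7, ("C", "C"): 1.3,
}

def w(a, b):
    """Fitness of the genotype made of alleles a and b (order does not matter)."""
    return W[tuple(sorted((a, b)))]

def mean_fitness(freq):
    """Average fitness: genotype fitnesses weighted by random-mating genotype freqs."""
    return sum(freq[a] * freq[b] * w(a, b) for a in freq for b in freq)

def avg_effect(allele, freq):
    """Average effect of an allele: its marginal fitness minus the average fitness."""
    marginal = sum(freq[b] * w(allele, b) for b in freq)
    return marginal - mean_fitness(freq)

def delta(allele, freq):
    """One-generation change in the allele's frequency: freq * avg effect / avg fitness."""
    return freq[allele] * avg_effect(allele, freq) / mean_fitness(freq)

fixed_for_A = {"A": 1.0, "S": 0.0, "C": 0.0}       # population before S or C appear
equilibrium = {"A": 8 / 9, "S": 1 / 9, "C": 0.0}   # assumed unrounded A/S equilibrium

for label, freq in [("fixed for A", fixed_for_A), ("A/S equilibrium", equilibrium)]:
    print(label, mean_fitness(freq), avg_effect("S", freq), avg_effect("C", freq))
```

Under these assumptions a rare C allele has no advantage when the population is fixed for A and a negative average effect once S is at equilibrium, even though CC is the fittest genotype.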

genetic structure in populations

    1. genetic variation
      1. how is it organized?
        1. within populations
          1. between individuals
        2. between populations
      2. human racial groups
        1. some loci differ between races
          1. Table 26-9, p. 788
          2. Duffy
          3. still all races have all alleles
        2. genetic variation at most loci does not follow racial boundaries
          1. Table 26-8, p. 787
          2. Fig 26-5, p. 788
        3. there is as much variation within races as between
        4. there is no genetic basis for racial groups
    2. causes of population structure
      1. geographical subdivision
        1. lakes, streams, mountains
      2. resource distribution
      3. isolation by distance
    3. consequences of population structure
      1. non-random mating
      2. inbreeding
      3. inc. genetic relatedness
      4. dec. heterozygosity
      5. inc. homozygosity
      6. finite population structure
      7. genetic drift
    4. inbreeding
      1. mating between relatives
      2. identity by descent, ibd
        1. a kind of identity
        2. copies of the same DNA sequence in a common ancestor
        3. example of ibd
          1. A*A x Aa
          2. here are their offspring: .25 A*A, .25 A*a, .25 AA, .25 Aa
          3. consider a sib mating: A*A x A*a
          4. here are the offspring of the sib mating: .25 A*A*, .25 AA*, .25 A*a, .25 Aa
          5. the AA offspring contains identical alleles
          6. the A*A* offspring contains identical alleles that are also identical by descent, that is, identical by being copies of the allele A* in the A*A grandparent
      3. inbreeding coefficient, F
        1. F = prob two alleles ibd
          1. alleles at same locus
          2. alleles picked from same individual
      4. effect of inbreeding on a single population
        1. consider population with F

                    AA                Aa          aa
          1-F       p^2               2pq         q^2
          F         p                             q
          total     (1-F)p^2 + pF     (1-F)2pq    (1-F)q^2 + qF

        2. dec heterozygosity
          1. Het_F = (1 - F) 2pq
        3. increase in homozygosity
        4. gene freqs unaffected by F
        5. example: self fertilization
          1. see figure 26-12 of text p. 796
          2. note: gene frequencies stay the same
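
The genotype frequencies for a population with inbreeding coefficient F can be tabulated with a short function. This is a minimal sketch; the values of p and F are arbitrary examples and the function name is my own.

```python
def inbred_genotype_freqs(p, F):
    """Genotype freqs (AA, Aa, aa) for gene freq p and inbreeding coefficient F."""
    q = 1 - p
    f_AA = (1 - F) * p * p + F * p
    f_Aa = (1 - F) * 2 * p * q
    f_aa = (1 - F) * q * q + F * q
    return f_AA, f_Aa, f_aa

p = 0.6                                   # arbitrary example gene frequency
for F in (0.0, 0.25, 1.0):                # F = 0 gives HW freqs; F = 1 is complete inbreeding
    f_AA, f_Aa, f_aa = inbred_genotype_freqs(p, F)
    p_back = f_AA + 0.5 * f_Aa            # gene frequency recovered from the genotypes
    print(F, f_AA, f_Aa, f_aa, p_back)
```

In each case the recovered gene frequency equals p, while the heterozygote class shrinks as F increases.
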
      5. effect of inbreeding in a subdivided population
        1. big population made up of many smaller populations
        2. Fig. 26-13, p. 797
        3. different populations fix (become homozygous) for different alleles
        4. heterozygosity lost according to formula on p. 796
          1. H_t = H_0 (1 - 1/(2N))^t
      6. why 1/(2N)?
        1. consider diploid population of size N: the chance of ibd is 1/(2N)
        2. for simplicity ignore the issue of separate male and female sexes
        3. each individual produces k gametes
        4. random mating among these kN gametes produced
        5. pick a gamete
        6. what is the chance another gamete has a copy of the same allele ibd to the one just picked?
          1. there are (k/2)-1 other gametes that have alleles ibd to the one just picked or approximately k/2 for k large
          2. so the chance of ibd is (k/2)/(kN) = 1/(2N)
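
A quick numerical check of the 1/(2N) argument and of the heterozygosity loss it implies. The population size, k, and starting heterozygosity are arbitrary; the function names are my own. One detail is my own refinement: the exact count below also removes the picked gamete from the denominator, which the notes ignore for large k.

```python
def ibd_chance(N, k):
    """Chance that a second gamete carries a copy ibd to the first one picked.

    Each of the 2N allele copies appears in k/2 of the kN gametes, so k/2 - 1
    of the remaining kN - 1 gametes match the one already picked.
    """
    return (k / 2 - 1) / (k * N - 1)

def heterozygosity(H0, N, t):
    """Heterozygosity after t generations in a population of size N."""
    return H0 * (1 - 1 / (2 * N)) ** t

N = 10
print(ibd_chance(N, 10), ibd_chance(N, 10**6), 1 / (2 * N))   # approaches 1/(2N) as k grows
print(heterozygosity(0.5, N, 10), heterozygosity(0.5, 1000, 10))
```

Small populations lose heterozygosity much faster than large ones over the same number of generations.
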
      7. coefficient of kinship, r
        1. measure of genetic relatedness
        2. used in calculations of inclusive fitness
        3. r = prob two alleles ibd
          1. alleles at same locus
          2. alleles picked from two different individuals
        4. = F of a hypothetical offspring of the two individuals
      8. inbreeding depression
        1. lower fitness of inbred offspring
        2. fact of nature
        3. dominance hypothesis: masking of recessive deleterious alleles
        4. overdominance: superior fitness of heterozygote
        5. application: evolution of sex
          1. sex
            1. recombination
            2. outcrossing
          2. outcrossing maintained by inbreeding depression
      9. genetic drift
        1. = changes in gene freq due to sampling error
        2. sampling error is larger in small populations
        3. consider a very small sample of size 1
          1. if gene frequency is .5
          2. there is only a .50 chance of maintaining that freq, that is of picking a heterozygote
          3. there is a .25 chance of the gene freq being 1 in the sample (AA)
          4. there is a .25 chance of the gene freq being 0 in the sample (aa)
        4. many mutations are neutral in their effects on fitness and their evolution is determined by genetic drift
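
The sample-of-size-1 drift example above is a binomial sampling calculation, and the same calculation shows why sampling error shrinks in larger populations. A minimal sketch, assuming each of the 2N allele copies is drawn independently; the function names are my own.

```python
from math import comb

def sample_distribution(p, n_individuals):
    """Probability of each possible gene frequency in a sample of diploid individuals."""
    n = 2 * n_individuals                 # number of allele copies sampled
    return {i / n: comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)}

def sampling_variance(p, n_individuals):
    """Binomial variance of the gene frequency in the sample: p(1 - p)/(2N)."""
    return p * (1 - p) / (2 * n_individuals)

one = sample_distribution(0.5, 1)         # freq 0 (aa), 0.5 (Aa), 1 (AA)
print(one)
print(sampling_variance(0.5, 1), sampling_variance(0.5, 1000))
```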